Non-Gaussian Component Analysis: a Semi-parametric Framework for Linear Dimension Reduction

Authors

  • Gilles Blanchard
  • Masashi Sugiyama
  • Motoaki Kawanabe
  • Vladimir G. Spokoiny
  • Klaus-Robert Müller
Abstract

We propose a new linear method for dimension reduction that identifies non-Gaussian components in high-dimensional data. Our method, NGCA (non-Gaussian component analysis), uses a very general semi-parametric framework. In contrast to existing projection methods, we define what is uninteresting (Gaussian): by projecting out the uninteresting part, we can estimate the relevant non-Gaussian subspace. We show that the estimation error of finding the non-Gaussian components tends to zero at a parametric rate. Once the NGCA components are identified and extracted, they can be used for various data analysis tasks, such as data visualization, clustering, denoising, or classification. A numerical study demonstrates the usefulness of our method.
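The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of how a non-Gaussian subspace might be estimated, not the authors' implementation. It assumes a construction based on Stein's identity: for whitened data z and a smooth function h, the vector E[z h(z)] - E[∇h(z)] vanishes in Gaussian directions, so many such vectors should concentrate in the non-Gaussian subspace, and their leading principal directions give an estimate of it. The function name `ngca_sketch`, the choice h(z) = tanh(w·z) with random unit directions w, and the toy data (two uniform sources hidden among Gaussian ones) are all illustrative assumptions.

```python
import numpy as np

def ngca_sketch(X, n_components, n_directions=200, seed=0):
    """Hypothetical helper: estimate an orthonormal basis of the
    non-Gaussian subspace of X (rows are samples)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Whiten the data so that its empirical covariance is the identity.
    Xc = X - X.mean(axis=0)
    evals, evecs = np.linalg.eigh(Xc.T @ Xc / n)
    Z = Xc @ (evecs @ np.diag(evals ** -0.5) @ evecs.T)

    # Assumed test functions: h_k(z) = tanh(w_k . z), random unit w_k.
    W = rng.standard_normal((n_directions, d))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    T = np.tanh(Z @ W.T)                              # shape (n, K)

    # beta_k = mean(z * h_k(z)) - mean(grad h_k(z)),
    # where grad h_k(z) = w_k * (1 - tanh(w_k . z)^2).
    betas = (T.T @ Z) / n - W * (1.0 - T ** 2).mean(axis=0)[:, None]

    # Leading right singular vectors of the stacked betas span the estimate.
    _, _, Vt = np.linalg.svd(betas, full_matrices=False)
    return Vt[:n_components].T                        # shape (d, n_components)

# Toy data: 2 uniform (non-Gaussian) sources and 4 Gaussian ones,
# mixed by a random orthogonal matrix Q.
rng = np.random.default_rng(1)
n, d, m = 20000, 6, 2
S = rng.standard_normal((n, d))
S[:, :m] = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(n, m))  # unit variance
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
X = S @ Q.T

basis = ngca_sketch(X, m)
true_basis = Q[:, :m]
# Cosines of the principal angles between estimated and true subspaces.
cosines = np.linalg.svd(true_basis.T @ basis, compute_uv=False)
print("smallest principal-angle cosine:", cosines.min())
```

A smallest cosine close to 1 indicates that the estimated subspace nearly coincides with the true one. Random directions keep the sketch short; a practical method would likely choose or refine the test functions more carefully.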

Similar resources

A Novel Dimension Reduction Procedure for Searching Non-Gaussian Subspaces

In this article, we consider high-dimensional data which contains a low-dimensional non-Gaussian structure contaminated with Gaussian noise and propose a new linear method to identify the non-Gaussian subspace. Our method NGCA (Non-Gaussian Component Analysis) is based on a very general semiparametric framework and has a theoretical guarantee that the estimation error of finding the non-Gaussia...

Likelihood Component Analysis

Independent component analysis (ICA) is popular in many applications, including cognitive neuroscience and signal processing. Due to computational constraints, principal component analysis is used for dimension reduction prior to ICA (PCA+ICA), which could remove important information. The problem is that interesting independent components (ICs) could be mixed in several principal components th...

Distance metric learning by minimal distance maximization

Classic linear dimensionality reduction (LDR) methods, such as principal component analysis (PCA) and linear discriminant analysis (LDA), are known not to be robust against outliers. Following a systematic analysis of the multi-class LDR problem in a unified framework, we propose a new algorithm, called minimal distance maximization (MDM), to address the non-robustness issue. The principle behi...

Linear Dependent Dimensionality Reduction

We formulate linear dimensionality reduction as a semi-parametric estimation problem, enabling us to study its asymptotic behavior. We generalize the problem beyond additive Gaussian noise to (unknown) non-Gaussian additive noise, and to unbiased non-additive models.

Semi-supervised learning with Gaussian fields

Gaussian fields (GF) have recently received considerable attention for dimension reduction and semi-supervised classification. This paper presents two contributions. First, we show how the GF framework can be used for regression tasks on high-dimensional data. We consider an active learning strategy based on entropy minimization and a maximum likelihood model selection method. Second, we show h...

Publication date: 2005